Clustering-Based Categorical Data Protection

Abstract

The need to improve privacy in public datasets is becoming increasingly important because the number of publicly available datasets is growing very fast. This has driven continuous research into better protection methods that prevent the disclosure of the entities or individuals in a dataset while preserving data utility. In this paper we present a new approach for categorical data protection based on applying clustering to the dataset and then protecting each cluster. We show that this new approach allows us to obtain protections with a better trade-off between data utility and the disclosure of individuals' information.

URL: http://link.springer.com/chapter/10.1007%2F978-3-642-33627-0_7?LI=true#

Source URL: https://www.iiia.csic.es/en/node/54293
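
The cluster-then-protect idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or which protection method the paper uses, so everything here is an assumption made for illustration: a tiny k-modes-style clustering with Hamming distance, followed by swapping attribute values among the records of each cluster. The function names (`cluster_records`, `protect_by_cluster`) and the toy records are hypothetical.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_record(records):
    """Attribute-wise most frequent value of a group of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def cluster_records(records, k, iterations=10):
    """Tiny k-modes-style clustering; returns one cluster index per record."""
    # Assumption: the first k distinct records serve as initial centres.
    centres = list(dict.fromkeys(records))[:k]
    labels = [0] * len(records)
    for _ in range(iterations):
        # Assign every record to its nearest centre (ties go to the lowest index).
        labels = [min(range(len(centres)), key=lambda c: hamming(r, centres[c]))
                  for r in records]
        # Recompute each centre as the mode of its current members.
        for c in range(len(centres)):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if members:
                centres[c] = mode_record(members)
    return labels


def protect_by_cluster(records, labels, seed=0):
    """Assumed protection step: shuffle each attribute within each cluster."""
    rng = random.Random(seed)
    protected = [list(r) for r in records]
    for c in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        for attr in range(len(records[0])):
            values = [records[i][attr] for i in idx]
            rng.shuffle(values)
            for i, v in zip(idx, values):
                protected[i][attr] = v
    return [tuple(r) for r in protected]


# Hypothetical toy dataset with three categorical attributes.
records = [
    ("red", "small", "A"),
    ("blue", "large", "C"),
    ("red", "small", "B"),
    ("blue", "large", "D"),
    ("red", "medium", "A"),
    ("blue", "medium", "C"),
]
labels = cluster_records(records, k=2)
protected = protect_by_cluster(records, labels)
```

Because values are only exchanged between records of the same cluster, each cluster keeps its per-attribute value counts, which is one simple way of limiting the utility loss of the perturbation. The seeding rule used here is deliberately naive and depends on record order.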


Similar resources

A Clustering Algorithm for Categorical Data Based on Combining Measures

Clustering is one of the main techniques in data mining. It is a process that partitions a data set into groups so that the data within a cluster are as similar as possible to each other, while the data in different clusters are as different as possible. Clustering algorithms are divided into two categories according to the type of data: Clustering algorithms for numerical data and clustering algor...


Clustering Numerical and Categorical Data

Clustering is an important technique for data mining which allows us to discover unknown relationships in our data sets. Clustering algorithms that use metrics based on the natural ordering of numbers cannot be applied to categorical (non-numerical) data. In this tutorial we will review the main methods for numerical data clustering (K-Means, Hierarchical Clustering and Fuzzy C-Means) and then s...


Using Categorical Attributes for Clustering

Traditional clustering algorithms focused on clustering numeric data, exploiting the inherent geometric properties of the dataset to calculate distance functions between the points to be clustered. This distance-based approach was not suited to clustering real-life data containing categorical values. The focus of research then shifted to clustering such data and various categorical clust...


A Link-Based Cluster Collection Approach Combined Contagious Cluster With For Categorical Data Clustering

Data clustering is a challenging task in data mining. Various clustering algorithms have been developed to cluster or categorize datasets, and many of them are used to cluster categorical data. Some algorithms, however, cannot be applied directly to categorical data. Several attempts have been made to solve the problem of clustering categorical data via cluster ensembles. But th...


The "Best K" for Entropy-based Categorical Data Clustering

With the growing demand for cluster analysis of categorical data, a handful of categorical clustering algorithms have been developed. Surprisingly, to our knowledge, none has satisfactorily addressed the important problem for categorical clustering: how can we determine the best number of clusters K for a categorical dataset? Since categorical data does not have the inherent distance function ...



Publication date: 2017